Securing Personally Identifiable Information (PII) in DPS

Organizations often need to use production data for development, testing, analytics, or training machine learning models. This data can contain Personally Identifiable Information (PII), Protected Health Information (PHI), or Payment Card Industry (PCI) data. Sensitive data is especially vulnerable during collection, transformation, transmission, or temporary storage. As a security best practice, DPS lets you mask and scramble sensitive data to ensure that it is not exposed during data processing.

 

The Secure PII feature in Calibo Accelerate safeguards sensitive information and helps your organization maintain compliance. It prevents unintended data exposure when production data is used in lower environments, while still providing realistic, high-quality datasets for testing and analytics without legal or ethical risk. By embedding security principles into the core of the platform, it strengthens data governance and builds trust across the organization.

Industry use cases for securing PII

Here are some industry use cases where securing sensitive data is essential.

  • Banking/ Finance: Mask account numbers or scramble transaction references before sharing data with offshore testing teams.

  • Healthcare: Scramble patient identifiers during analytics runs while preserving the distribution of values for statistical accuracy.

  • Retail/ E-commerce: Mask credit card details or loyalty IDs while enabling recommendation engines to run on anonymized data.

  • Insurance: Scramble claims data fields so actuaries can model risks without exposing sensitive customer data.

Scrambling

Scrambling jumbles the characters in the selected column of a dataset without preserving the original structure or meaning. It helps randomize sensitive data while preserving the original length.

Scrambling strategies and their descriptions
Full String Scramble

Randomizes all characters in the entire string, including spaces and punctuation.

Example

  • Input: John Doe

  • Output: eoD ohJn
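
A minimal Python sketch of a full string scramble, for illustration only. The function name `full_string_scramble` and the `seed` parameter are hypothetical and do not describe how DPS implements the strategy.

```python
import random
from typing import Optional


def full_string_scramble(value: str, seed: Optional[int] = None) -> str:
    """Shuffle every character, including spaces and punctuation."""
    rng = random.Random(seed)  # seeding is an assumption, used for repeatability
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


scrambled = full_string_scramble("John Doe", seed=7)
print(scrambled)  # the same eight characters as the input, reordered
```

The output always has the same length and the same set of characters as the input; only their positions change.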

Scramble Within Word

Scrambles characters inside each word, preserving word boundaries and original word order.

Example

  • Input: Patient reports fever today

  • Output: Peatnit rerpots feevr tdoay

Scramble Word Order in Sentence

Shuffles the order of words in a sentence, keeping characters within each word unchanged.

Example

  • Input: This loan is approved today

  • Output: today approved loan is This
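
The word-order strategy can be sketched in Python as follows. This is an illustrative assumption, not the DPS implementation; `scramble_word_order` and `seed` are hypothetical names.

```python
import random
from typing import Optional


def scramble_word_order(sentence: str, seed: Optional[int] = None) -> str:
    """Shuffle whole words; characters inside each word stay untouched."""
    rng = random.Random(seed)
    words = sentence.split()  # splits on any run of whitespace
    rng.shuffle(words)
    return " ".join(words)


result = scramble_word_order("This loan is approved today", seed=3)
print(result)  # the same five words in a shuffled order
```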

Scramble Middle Characters of Each Word

Scrambles only the middle characters of each word, keeping the first and last characters intact.

Example

  • Input: John Carter

  • Output: Jhon Craetr
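
A short Python sketch of scrambling only the middle characters of each word. The function name `scramble_middle` and the `seed` parameter are hypothetical; the sketch assumes words are separated by single spaces.

```python
import random
from typing import Optional


def scramble_middle(text: str, seed: Optional[int] = None) -> str:
    """Shuffle the interior of each word, keeping first and last characters."""
    rng = random.Random(seed)

    def scramble_word(word: str) -> str:
        # Words of three or fewer characters have at most one middle
        # character, so there is nothing to reorder.
        if len(word) <= 3:
            return word
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]

    return " ".join(scramble_word(w) for w in text.split(" "))


result = scramble_middle("John Carter", seed=11)
print(result)  # each word still starts and ends with its original characters
```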

Scramble N Random Pairs

Swaps N random pairs of characters within the string.

Example

  • Input: A2459B7712

  • Output: A2594B7172
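
The pair-swapping idea can be sketched in Python as shown here. `scramble_n_pairs` and its parameters are hypothetical names chosen for illustration; how DPS picks the pairs is not documented here.

```python
import random
from typing import Optional


def scramble_n_pairs(value: str, n: int, seed: Optional[int] = None) -> str:
    """Swap n randomly chosen pairs of positions in the string."""
    rng = random.Random(seed)
    chars = list(value)
    if len(chars) < 2:
        return value  # nothing to swap
    for _ in range(n):
        i, j = rng.sample(range(len(chars)), 2)  # two distinct positions
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


result = scramble_n_pairs("A2459B7712", n=3, seed=5)
print(result)
```

Because the pairs are drawn independently, a later swap can touch a position that an earlier swap already moved, so fewer than 2 × N positions may end up changed.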

Scramble Every Nth Character

Scrambles every nth character in the string while keeping others in place.

Example

  • Input: 221B Baker Street

  • Output: 22XB BaYer SZreet (X, Y, and Z stand in for the scrambled characters)

Reverse String Scramble

Reverses the entire string from end to start.

Example

  • Input: Seattle

  • Output: elttaeS

Scramble Based on Word Length

Scrambles only the words that meet the specified length condition.

Example

In this example, words longer than 5 characters are scrambled, while shorter words remain unchanged.

  • Input: Customer requested immediate delivery

  • Output: Cumetosr reuqestde immdieate delreviy
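
A Python sketch of a length-based scramble, assuming the condition is "longer than a threshold". The names `scramble_long_words` and `threshold` are hypothetical, and the sample sentence is chosen here to show a short word being left alone.

```python
import random
from typing import Optional


def scramble_long_words(text: str, threshold: int,
                        seed: Optional[int] = None) -> str:
    """Shuffle characters only in words longer than `threshold`."""
    rng = random.Random(seed)
    result = []
    for word in text.split(" "):
        if len(word) > threshold:
            chars = list(word)
            rng.shuffle(chars)
            word = "".join(chars)
        result.append(word)  # short words pass through unchanged
    return " ".join(result)


result = scramble_long_words("Please confirm delivery today", 5, seed=2)
print(result)  # "today" has 5 characters, so it is not scrambled
```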

Scramble N Percentage

Scrambles a defined percentage of characters in the string.

Example

In this example, 40% of the characters (4 of 10) are scrambled.

  • Input: 9876543210

  • Output: 9854673210
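
One way to read this strategy is: pick a percentage of positions, then shuffle only the characters at those positions. The Python sketch below follows that reading; the function name, the rounding rule, and the random choice of positions are all assumptions.

```python
import random
from typing import Optional


def scramble_percentage(value: str, percent: float,
                        seed: Optional[int] = None) -> str:
    """Shuffle the characters found at a random `percent` of positions."""
    rng = random.Random(seed)
    percent = max(0, min(100, percent))
    chars = list(value)
    count = round(len(chars) * percent / 100)  # rounding rule is an assumption
    positions = rng.sample(range(len(chars)), count)
    picked = [chars[p] for p in positions]
    rng.shuffle(picked)
    for position, char in zip(positions, picked):
        chars[position] = char  # unselected positions are never touched
    return "".join(chars)


result = scramble_percentage("9876543210", 40, seed=1)
print(result)  # at most 4 of the 10 positions differ from the input
```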

Block Scramble

Divides the string into blocks of fixed size and scrambles characters within each block.

Example

In this example, block size is 4 and characters are scrambled within each block.

  • Input: calibo-support-team

  • Output: laicos-bopupr-ttaem
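
The block approach can be sketched in Python as follows. `block_scramble` and `block_size` are hypothetical names; the sketch assumes a shorter final block is scrambled on its own.

```python
import random
from typing import Optional


def block_scramble(value: str, block_size: int,
                   seed: Optional[int] = None) -> str:
    """Split into fixed-size blocks and shuffle characters inside each one."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    rng = random.Random(seed)
    blocks = []
    for start in range(0, len(value), block_size):
        block = list(value[start:start + block_size])  # last block may be short
        rng.shuffle(block)
        blocks.append("".join(block))
    return "".join(blocks)


result = block_scramble("calibo-support-team", 4, seed=8)
print(result)  # characters never move outside their 4-character block
```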

Scramble All Except First N Characters

Scramble all characters except the first N characters.

Example

In this example, the first 3 characters are kept intact and the rest are scrambled.

  • Input: ACC123456789

  • Output: ACC987654321

Scramble All Except Last N Characters

Scramble all characters except the last N characters.

Example

In this example the last 4 characters are kept intact and the rest are scrambled.

  • Input: john.doe@example.com

  • Output: mepxael@eod.nhoj.com
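
Both "all except first N" and "all except last N" can be sketched with a single Python helper. The name `scramble_except` and its `keep_first` and `keep_last` parameters are hypothetical and used only for illustration.

```python
import random
from typing import Optional


def scramble_except(value: str, keep_first: int = 0, keep_last: int = 0,
                    seed: Optional[int] = None) -> str:
    """Shuffle everything except the protected prefix and suffix."""
    rng = random.Random(seed)
    end = len(value) - keep_last
    if keep_first >= end:
        return value  # the protected parts cover the whole string
    middle = list(value[keep_first:end])
    rng.shuffle(middle)
    return value[:keep_first] + "".join(middle) + value[end:]


account = scramble_except("ACC123456789", keep_first=3, seed=1)
email = scramble_except("john.doe@example.com", keep_last=4, seed=1)
print(account)  # still starts with "ACC"
print(email)    # still ends with ".com"
```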

Pattern-Based Scramble

Scrambles only characters in a specified pattern. Maintains format-like patterns (e.g., phone numbers, license plates) while masking sensitive segments.

Example

  • Input: +1-415-552-7890 (Original phone number)

  • Output: +1-154-255-9708 (Scrambled phone number)
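
A Python sketch of one possible pattern-based scramble: digits are shuffled within each run of digits, and separators such as "+" and "-" stay in place. The pattern used here (`\d+`) and the function name `scramble_digit_groups` are assumptions; DPS may define patterns differently.

```python
import random
import re
from typing import Optional


def scramble_digit_groups(value: str, seed: Optional[int] = None) -> str:
    """Shuffle digits inside each digit group, leaving other characters."""
    rng = random.Random(seed)

    def shuffle_group(match: "re.Match[str]") -> str:
        digits = list(match.group())
        rng.shuffle(digits)
        return "".join(digits)

    return re.sub(r"\d+", shuffle_group, value)


result = scramble_digit_groups("+1-415-552-7890", seed=5)
print(result)  # keeps the +#-###-###-#### layout
```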

Masking

Masking replaces characters in the selected column with system-defined or custom characters, based on the chosen rule. It retains the original format and structure of the data while ensuring sensitive details remain hidden.

Masking strategies and their descriptions
Mask Entire Value

Replaces the entire value with a masking character.

Example - hiding sensitive fields like tokens or passwords

  • Input: A9F34KD82L

  • Output: **********

Mask Digits Only

Masks only the numeric characters.

Example

  • Input: AB-4589-ZT

  • Output: AB-****-ZT
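
The two strategies above can be sketched in Python as follows. The function names and the `mask_char` parameter are hypothetical; "*" is assumed as the default masking character.

```python
import re


def mask_entire(value: str, mask_char: str = "*") -> str:
    """Replace every character with the masking character."""
    return mask_char * len(value)


def mask_digits(value: str, mask_char: str = "*") -> str:
    """Replace only numeric characters; letters and symbols are kept."""
    # A function is used as the replacement so mask_char is taken literally.
    return re.sub(r"\d", lambda _match: mask_char, value)


print(mask_entire("A9F34KD82L"))  # **********
print(mask_digits("AB-4589-ZT"))  # AB-****-ZT
```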

Mask Every Nth Character

Masks every nth character in the string.

Example

In this example we mask every 3rd character.

  • Input: Seattleoffice

  • Output: Se*tt*eo*fi*e
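
A minimal Python sketch of masking every nth character, counting positions from 1. `mask_every_nth` is a hypothetical name used for illustration.

```python
def mask_every_nth(value: str, n: int, mask_char: str = "*") -> str:
    """Mask the characters at positions n, 2n, 3n, ... (1-based)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "".join(
        mask_char if position % n == 0 else char
        for position, char in enumerate(value, start=1)
    )


print(mask_every_nth("Seattleoffice", 3))  # Se*tt*eo*fi*e
```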

Mask First N Characters

Masks the first N characters in each value.

Example

In this example, we mask the first 4 characters.

  • Input: 9876543210

  • Output: ****543210

Mask Last N Characters

Masks the last N characters in each value.

Example

In this example, we mask the last 8 characters.

  • Input: john.doe@example.com

  • Output: john.doe@exa********

Mask All Except First N Characters

Keeps the first N characters visible, and masks the rest of the characters.

Example

In this example, we show only the first 3 characters and mask the remaining.

  • Input: ACC123456789

  • Output: ACC*********

Mask All Except Last N Characters

Keeps the last N characters visible, and masks the rest of the characters.

Example

In this example, we show the last 4 characters and mask the remaining characters.

  • Input: 9876543210

  • Output: ******3210
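
The four prefix and suffix strategies above reduce to simple string slicing. The Python sketch below uses hypothetical function names and assumes that an N larger than the value's length is treated as the full length.

```python
def _clamp(n: int, value: str) -> int:
    """Keep n within 0..len(value)."""
    return max(0, min(n, len(value)))


def mask_first(value: str, n: int, mask_char: str = "*") -> str:
    n = _clamp(n, value)
    return mask_char * n + value[n:]


def mask_last(value: str, n: int, mask_char: str = "*") -> str:
    n = _clamp(n, value)
    return value[:len(value) - n] + mask_char * n


def mask_except_first(value: str, n: int, mask_char: str = "*") -> str:
    keep = _clamp(n, value)
    return value[:keep] + mask_char * (len(value) - keep)


def mask_except_last(value: str, n: int, mask_char: str = "*") -> str:
    keep = _clamp(n, value)
    return mask_char * (len(value) - keep) + value[len(value) - keep:]


print(mask_first("9876543210", 4))            # ****543210
print(mask_last("john.doe@example.com", 8))   # john.doe@exa********
print(mask_except_first("ACC123456789", 3))   # ACC*********
print(mask_except_last("9876543210", 4))      # ******3210
```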

Mask Middle N Characters

Masks a group of N characters in the middle of each value.

Example

In this example, we mask the 4 middle characters and show the remaining characters.

  • Input: ABCD-1234-EFGH

  • Output: ABCD-****-EFGH
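
A Python sketch of masking the middle N characters. The name `mask_middle` is hypothetical, and placing the masked group so that any odd leftover character falls on the right side is an assumption.

```python
def mask_middle(value: str, n: int, mask_char: str = "*") -> str:
    """Mask a centered group of n characters."""
    n = max(0, min(n, len(value)))
    start = (len(value) - n) // 2  # leftover character, if any, stays on the right
    return value[:start] + mask_char * n + value[start + n:]


print(mask_middle("ABCD-1234-EFGH", 4))  # ABCD-****-EFGH
```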

Mask First N% Characters

Masks the first N% of each value.

Example

  • Input: CustomerSupportTeam

  • Output: ********SupportTeam

Mask Last N% Characters

Masks the last N% of each value.

Example

  • Input: NorthAmericaRegion

  • Output: NorthAmeric*******
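
Percentage-based masking can be sketched in Python by converting the percentage to a character count first. The function names are hypothetical, and both the 40% value and the rounding rule (Python's built-in `round`) are assumptions, since the examples above do not state N.

```python
def _count_for(value: str, percent: float) -> int:
    """Number of characters that `percent` of the value corresponds to."""
    percent = max(0, min(100, percent))
    return round(len(value) * percent / 100)


def mask_first_percent(value: str, percent: float, mask_char: str = "*") -> str:
    count = _count_for(value, percent)
    return mask_char * count + value[count:]


def mask_last_percent(value: str, percent: float, mask_char: str = "*") -> str:
    count = _count_for(value, percent)
    return value[:len(value) - count] + mask_char * count


# 40% of 19 characters is 7.6, which rounds to 8 masked characters.
print(mask_first_percent("CustomerSupportTeam", 40))  # ********SupportTeam
# 40% of 10 characters is exactly 4 masked characters.
print(mask_last_percent("9876543210", 40))            # 987654****
```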

Pattern-Based Masking

Masks characters based on user-defined, regex-like patterns.

Example

  • Input: +1-415-552-7890

    Pattern Rule

    +X-XXX-XXX-####

    (Keep last 4 digits visible; mask the rest.)

  • Output: +*-***-***-7890
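
A Python sketch of applying a pattern rule like the one in this example. It assumes a simple position-by-position reading of the rule: "X" means mask the character, and any other symbol (such as "#", "+", or "-") means keep the original character. The function name `mask_by_pattern` and this interpretation are assumptions, not the documented DPS rule syntax.

```python
def mask_by_pattern(value: str, pattern: str, mask_char: str = "*") -> str:
    """Mask each character whose pattern position is 'X'; keep the rest."""
    if len(value) != len(pattern):
        raise ValueError("value and pattern must have the same length")
    return "".join(
        mask_char if rule == "X" else char
        for char, rule in zip(value, pattern)
    )


print(mask_by_pattern("+1-415-552-7890", "+X-XXX-XXX-####"))  # +*-***-***-7890
```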

By applying scrambling or masking, you can protect sensitive information while still enabling realistic data for development, testing, and analytics — helping your organization maintain compliance and build customer trust.

 
